First purified recombinant CYP75B including transmembrane helix with unexpected high substrate specificity to (2R)-naringenin

Anthochlor pigments (chalcones and aurones) play an important role in yellow flower colouration and the formation of UV honey guides, and show numerous health benefits. The B-ring hydroxylation of chalcones is performed by membrane-bound cytochrome P450 enzymes. It was assumed that usual flavonoid 3′-hydroxylases (F3′Hs) are responsible for the 3,4-dihydroxy pattern of chalcones; however, we previously showed that a specialized F3′H, namely chalcone 3-hydroxylase (CH3H), is necessary for the hydroxylation of chalcones. In this study, a sequence encoding membrane-bound CH3H from Dahlia variabilis was recombinantly expressed in yeast and a purification procedure was developed. The optimized purification procedure led to an overall recovery of 30% recombinant DvCH3H with a purity of more than 84%. The enzyme was biochemically characterized with regard to its kinetic parameters on various substrates, including racemic naringenin, its enantiomers (2S)- and (2R)-naringenin, apigenin and kaempferol. We report for the first time the characterization of a purified cytochrome P450 enzyme from the flavonoid biosynthesis pathway, including the transmembrane helix. Further, we show for the first time that recombinant DvCH3H displays a higher affinity for (2R)-naringenin than for (2S)-naringenin, although (2R)-flavanones are not naturally formed by chalcone isomerase.

by the program TMHMM 31 to constitute the membrane anchor, which is attached to the membrane of the endoplasmic reticulum (Fig. 3). In order to obtain further information on the relationship of DvCH3H to other CYP75B sequences within the Asteraceae family, a BLASTP search using the NCBI database was performed and 50 sequences were obtained after filtering. DvCH3H shows a sequence identity of 70-84% to most of the obtained sequences (see Supplementary Fig. S1), while it shows 90% identity with CsCH3H and 84-87% sequence identity with F3′Hs from Tagetes, Dahlia, Rudbeckia, Zinnia, Bidens, Cosmos and Helianthus (see Supplementary Table S1). This is also reflected by the corresponding phylogenetic tree (Fig. 2). The enzymes exhibiting chalcone hydroxylase activity, namely the two CH3Hs from Dahlia pinnata (GQ479804 and BDE26439), cluster together with CsCH3H (ACO35755.1). The sequence BDE26439.1 was released during the review process of this manuscript and is almost identical to GQ479804, with just two mutations within the membrane anchor and lacking the last two amino acids of GQ479804. Notably, no other sequence clusters within this branch. However, it has been reported that F3′H from Tagetes erecta (ACO35756.1) showed weak chalcone hydroxylase activity in in vitro activity assays 11, suggesting that the enzymes of this main branch might possess at least some of the requirements necessary for chalcone hydroxylation. The substrate specificity of a CYP is determined by the amino acids within 6 substrate recognition sites (SRS) [32][33][34] (Fig. 3), which are located in the proximity of the heme centre. In DvCH3H the typical LSXXG pattern is present in SRS1, which is presumably required for the hydroxylation of chalcones, as well as the loop region found in CsCH3H (FJ216429) 13.

Purification.
A purification procedure for the membrane-bound DvCH3H, recombinantly cultivated in controlled bioreactor runs in P. pastoris KM71H (for more details see 40 ), was established by systematic investigation of each purification step. An overview is shown in Fig. 4.

Homogenization and centrifugation.
For yeast cell disruption, it has been reported that mechanical cell disruption techniques such as high pressure homogenization, bead milling or sonication lead to higher recovery yields, when it comes to breakage of the fungal cell wall, in comparison to electrical, enzymatic, physical or chemical disruption methods 41. We chose high pressure homogenization, as the technique is efficient and scalable and its results are reproducible. Previous studies on cell disruption of yeast by high pressure homogenization showed that the number of passages and the pressure had the most significant impact 42. Therefore, we kept those parameters high (1800 bar and 10 passages) and investigated the impact of the dry cell weight concentration on the homogenization process. The dry cell weight was used to calculate the biomass, as it is much more accurate than the wet cell weight and is necessary for the optimization of the homogenization process. For the wet cell weight, the water remaining in the cell pellet depends strongly on the centrifugation process and biases the calculation of the cell mass used. Samples were homogenized at dry cell weight concentrations between 6 and 60 g/L and subsequently centrifuged at 20,000× g for 30 min. As shown in Table 1, there is a slight trend towards less product in the supernatant at higher biomass concentrations. However, as the recovery of target protein at 60 g DCW/L was only 6% lower than at 6 g DCW/L, we decided to continue with the high biomass concentration, as this allowed working with a more concentrated protein solution, reducing the volumes during subsequent purification steps. Nevertheless, in general, around 60% of the target protein was found in the cell pellet, which is discarded after the disruption procedure. We hypothesized that this could result from (I) poorly disrupted cells or (II) an overly harsh subsequent centrifugation, which caused partial sedimentation of the target protein.
To investigate the disruption efficiency, colony forming units (CFUs) of frozen and thawed cells were compared to those of frozen, thawed and homogenized cells, and we found that the homogenization process only led to a 55% reduction in CFUs. However, due to cooling limitations, neither the pressure nor the number of passages could be further increased.
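The CFU comparison can be illustrated with a minimal sketch. The colony counts, dilution factor and plated volume below are hypothetical placeholders chosen only to reproduce a 55% reduction; they are not measured values from this study, and the function names are illustrative.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Back-calculate CFU/mL of the undiluted sample from one countable plate."""
    return colonies * dilution_factor / plated_volume_ml


def cfu_reduction(cfu_before, cfu_after):
    """Fraction of viable cells lost between two samples."""
    return 1.0 - cfu_after / cfu_before


# Hypothetical plate counts (assumed values, not data from the study):
# both samples plated at a 1:10,000 dilution, 0.1 mL per plate.
before = cfu_per_ml(colonies=200, dilution_factor=10_000, plated_volume_ml=0.1)
after = cfu_per_ml(colonies=90, dilution_factor=10_000, plated_volume_ml=0.1)

print(f"CFU/mL before homogenization: {before:.2e}")
print(f"CFU/mL after homogenization:  {after:.2e}")
print(f"Reduction in CFUs: {cfu_reduction(before, after):.0%}")
```

Both samples must be compared at the same dry cell weight concentration for the ratio to be meaningful.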
Next, it was investigated whether a less harsh centrifugation procedure could lead to higher amounts of target protein in the supernatant. In the literature, the separation of cell debris is usually carried out by centrifugation between 1000 and 15,000× g for 15 min up to 30 min, which is a rather broad range [43][44][45][46][47]. To shed more light on the optimal conditions, g-forces between 1000 and 10,000, and centrifugation times between 5 and 20 min were investigated. The results in Table 1 show that a reduction in the g-force clearly leads to more protein in the supernatant with a recovery of up to 73%. At the same time, it was beneficial to keep the centrifugation time rather high. This can be explained by the fact that, with shorter centrifugation times, the pellet is more voluminous, causing a lower recovery of target protein in the supernatant. Further reduction of the centrifugation force led to an incomplete separation of unbroken cells.
Ultracentrifugation. After cell disruption, the membrane protein fraction was collected by ultracentrifugation. It is generally recommended to pellet membrane proteins at 100,000-200,000× g for 1-2 h [43][44][45]48 . We investigated the time needed to pellet recombinant DvCH3H at 200,000× g by analysing samples after 30 min and 60 min. As shown in Table 1, already after 30 min, no target protein was detectable in the supernatant anymore. Longer centrifugation times produced a more compact pellet which was more difficult to resuspend and solubilize in the subsequent step. Therefore, 30 min at 200,000× g was chosen for subsequent experiments.

Solubilization and ultracentrifugation.
After pelleting the membrane protein fraction, the target protein needs to be solubilized out of the membrane by detergents. Therefore, we tested solubilisation in six different detergents at 4 °C overnight 45, always at 10 times their critical micellar concentration. The obtained results are shown in Table 1. LDAO solubilized the highest amount of target protein (36%); however, initial purification results showed that the protein was very unstable in this detergent. LDAO is considered a rather harsh detergent due to its zwitterionic nature, as its charges can interact with non-hydrophobic parts of the protein, potentially causing instability issues. OTG seemed to have a negative influence on protein stability, as a band of very low
Chromatography. To further purify the recombinant, transmembrane helix-possessing, his6-tagged DvCH3H, the solubilized membrane protein solution was loaded on immobilized metal ion affinity chromatography (IMAC) columns. Affinity chromatography can be quite challenging for tagged membrane proteins, as they sometimes bind very poorly to the resin due to the huge detergent micelles, which impair accessibility of the tag 45. In order to reduce the detergent concentration in the load, the solubilized protein was diluted to a final DDM concentration of 2 mM with detergent-free buffer. We were interested in whether a classic particle-based column or a monolithic column was superior for the purification. The results of two purification runs, one on a particle-based column and one on a monolithic column, with identical column volumes and similar metal ion capacities (15 µmol Ni2+/mL resin (particle-based) and 23 ± 10 µmol Cu2+/mL resin (monolithic)), are compared in Table 2. After 1:10 dilution of the load, rather high recoveries of DvCH3H were found, especially on the particle-based column, where 93% could be captured and eluted.
On the monolithic column, in contrast, only 71% of DvCH3H was recovered in the eluate, and binding of proteins in general seemed less specific, as the purity of the eluate was only 32%. However, the eluate of the particle-based column was also only 56% pure, which is why an additional size exclusion chromatography step was carried out. This allowed purification to more than 84%, as determined by analytical size exclusion chromatography (Fig. 5b), SDS-PAGE and Western blot (Fig. 5a).
Overall purification process. The entire purification process is summarized in Table 3. Values marked with an asterisk were theoretically calculated based on indirect measurements, when direct determination was not possible (for details see SI).
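The overall recovery of a multi-step procedure is the product of the fractional recoveries of the individual steps. The following minimal sketch illustrates this relation and, conversely, the geometric-mean step recovery implied by a given overall recovery. The step values in the example are hypothetical placeholders and not the entries of Table 3; the function names are illustrative.

```python
import math


def overall_recovery(step_recoveries):
    """Multiply fractional step recoveries into one overall recovery."""
    return math.prod(step_recoveries)


def mean_step_recovery(overall, n_steps):
    """Geometric-mean step recovery implied by an overall recovery."""
    return overall ** (1.0 / n_steps)


# Hypothetical fractional recoveries for five steps (assumed values,
# NOT the values reported in Table 3).
example_steps = [0.73, 0.95, 0.65, 0.93, 0.75]
print(f"Overall recovery of the example: {overall_recovery(example_steps):.2f}")

# A 30% overall recovery over five steps corresponds to an average
# (geometric mean) step recovery of roughly 79%.
print(f"Mean step recovery: {mean_step_recovery(0.30, 5):.3f}")
```

Because the recoveries multiply, a single weak step dominates the overall yield, which is why step-wise optimization pays off.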
The overall recombinant DvCH3H recovery was 30%, with over 60% in each of the five purification steps. The most critical step was the solubilisation of the membrane protein. Advanced optimization with respect to the ionic strength or the pH of the solubilisation buffer might result in higher yields 45.
Table 1. Results of western blots with percentages of recombinant DvCH3H in supernatants (S) and pellets (P). The fractions that were further processed are marked in bold. DDM: n-Dodecyl β-D-maltoside; DM: n-Decyl-β-D-maltoside; CHAPS: 3-((3-Cholamidopropyl)dimethylammonio)-1-propanesulfonate; FC-12: Fos-choline-12; LDAO: Dodecyldimethylamine oxide; OTG: Octylthioglucoside. ᵃ OTG fractionated the protein during SDS-PAGE, therefore two different bands were detected on the Western blot. The 12% band was at the correct size whereas the 65% band was much smaller. Average standard deviation of the results is 6.4%. Western blots for the homogenisation and the detergent type are provided in Supplementary Fig. 2.

Biochemical characterization.
Kinetic properties of recombinant DvCH3H. DvCH3H needs a cytochrome P450 reductase (CPR) as redox partner in order to be catalytically active. Therefore, microsomal preparations of yeast cells expressing Catharanthus roseus NADPH-cytochrome P450 reductase (CrCPR) were included in all activity assays. In the presence of microsomal preparations of recombinantly expressed CrCPR and NADPH as the electron donor, recombinant DvCH3H converted isoliquiritigenin to butein, naringenin to eriodictyol, apigenin to luteolin, kaempferol to quercetin, and dihydrokaempferol to dihydroquercetin. However, isoliquiritigenin and dihydrokaempferol were not only metabolised by DvCH3H to butein and dihydroquercetin, but also converted to unidentified reaction products by yeast-derived enzymes present in the microsome preparation of recombinant CrCPR.
Due to the high extent of these unknown products, the kinetic parameters of these substrates would be strongly biased and, therefore, could not be included in the kinetic studies.
For the other substrates, the pH optimum of the recombinant DvCH3H was investigated from pH 5.5 to pH 8.5 (see Supplementary Fig. S4). The pH optimum of DvCH3H for all tested substrates is between 7.0 and 8.0. DvCH3H shows the highest activity for naringenin between pH 7.0 and pH 7.5, for kaempferol between pH 7.5 and pH 8.0 and for apigenin at pH 7.5. For better comparability, the kinetic measurements were performed at pH 7.5. An NADPH concentration of 1.55 mM provided a sufficient excess and did not limit the activity of DvCH3H.
Kinetic analysis of recombinant DvCH3H was performed with the flavonol kaempferol, the flavone apigenin, and the flavanone naringenin. Usually, just a racemic mixture of naringenin is used as a substrate for investigations of enzymes involved in the flavonoid pathway. In the flavonoid pathway, however, only the (2S)-enantiomer is formed by CHI. A racemic mixture can be formed in planta, if the chalcones are isomerized chemically and in the absence of CHI. In this study, we therefore tested the racemic mixture and the pure enantiomers separately. The substrate concentration was varied in a range of 0.25-100 µM (Fig. 6) at a fixed concentration of 1.55 mM of NADPH, and the product formation was quantified by HPLC. The respective values for K_M, v_max and k_cat were calculated from the curves in Fig. 6 and are summarized in Table 4. The kinetic data for apigenin, kaempferol, (2S)-naringenin, (2R)-naringenin and the racemic naringenin show K_M values in a range below 1 µM, reflecting a high affinity of recombinant DvCH3H to all substrates (Table 4). The lowest K_M value and the highest catalytic efficiency was observed with apigenin as substrate. In contrast, the highest K_M value was obtained with (2S)-naringenin as substrate, however, the catalytic efficiency was slightly higher than the catalytic efficiency for apigenin. Unexpectedly, recombinant DvCH3H displayed a catalytic efficiency for (2R)-naringenin that is comparable to that of kaempferol. Furthermore, (2R)-naringenin showed a near two-fold higher affinity and a slightly higher turnover number than (2S)-naringenin, although it is not the naturally formed enantiomer in the flavonoid pathway.
The K_M and v_max values of racemic naringenin essentially reflect the values obtained with (2R)-naringenin, due to the significantly higher affinity and turnover rates of (2R)-naringenin in comparison to (2S)-naringenin. This indicates that recombinant DvCH3H would preferentially transform (2R)-naringenin to (2R)-eriodictyol, if racemic mixtures were naturally available as substrate. A stereospecific interaction of naringenin enantiomers with various CYPs has been previously reported and is of pharmacological interest, as naringenin glycosides are abundantly present in grapefruit juice, which is known to inhibit the metabolism of drugs 49. Interestingly, (2R)-naringenin was previously shown to be converted by the recombinant 2-oxoglutarate dependent flavonol synthase from Citrus unshiu to (−)-trans-dihydrokaempferol, but the subsequent conversion into the corresponding flavonol was not possible 50. As other enzymes, such as DFR and flavone synthase I, are also stereospecific 51, (2R)-naringenin conversion does not seem to be of physiological relevance in the later steps of the flavonoid pathway.
Among the natural substrates, the flavone apigenin seems to be the preferred substrate. Compared with the kinetic data reported for CH3H of Cosmos sulphureus 12, recombinant DvCH3H shows a distinct difference in the flavonoid preferences. Recombinant DvCH3H has the highest catalytic efficiency for apigenin, whereas recombinant CsCH3H has the highest hydroxylation efficiency with kaempferol, apart from the preferred isoliquiritigenin. Unfortunately, chalcones could not be tested with the current test system. However, of the three flavonoid substrates tested, apigenin shows the closest structural similarity to isoliquiritigenin due to the double bond in the C3 bridge connecting rings A and B and, in contrast to the flavonol kaempferol, no hydroxyl group is present at the double bond. Based on these similarities, a high affinity for isoliquiritigenin and a high efficiency in conversion of this substrate can be hypothesized. Activity assays with purified CPR, as well as crystal structure analysis of substrate-enzyme complexes, will provide further insights into the differing binding modes of the different substrates.
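The turnover number and catalytic efficiency compared above follow from the fitted parameters via k_cat = v_max / [E] and k_cat / K_M. The sketch below shows this arithmetic only; the molecular weight, v_max and K_M values are hypothetical placeholders and not the entries of Table 4, and the function names are illustrative.

```python
def kcat_per_s(vmax_mol_per_s, enzyme_ng, mw_g_per_mol):
    """Turnover number k_cat (1/s) from v_max and the enzyme amount in the assay."""
    enzyme_mol = enzyme_ng * 1e-9 / mw_g_per_mol  # ng -> g -> mol
    return vmax_mol_per_s / enzyme_mol


def catalytic_efficiency(kcat, km_micromolar):
    """Catalytic efficiency k_cat / K_M in 1/(s*M), with K_M given in µM."""
    return kcat / (km_micromolar * 1e-6)


# Assumed illustration values (NOT data from Table 4):
# 1.25 ng enzyme, molecular weight 50,000 g/mol, v_max 2.5e-14 mol/s, K_M 0.5 µM.
kcat = kcat_per_s(vmax_mol_per_s=2.5e-14, enzyme_ng=1.25, mw_g_per_mol=50_000)
eff = catalytic_efficiency(kcat, km_micromolar=0.5)
print(f"k_cat = {kcat:.2f} 1/s, k_cat/K_M = {eff:.2e} 1/(s*M)")
```

Since efficiency is a ratio, a substrate with a higher K_M can still reach a similar efficiency if its turnover number is proportionally higher.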

Conclusion
This is the first report of the isolation of a CH3H cDNA sequence of dahlia flowers, as well as the recombinant production, purification and biochemical characterization of the enzyme. We developed a purification procedure for the transmembrane helix-possessing DvCH3H by step-wise optimization of each downstream processing step. The final procedure allowed an overall recovery of 30%, with more than 60% in each purification step. The obtained DvCH3H was 84% pure and biochemically characterized regarding its pH optimum, as well as kinetic parameters on various substrates, including apigenin, kaempferol and the enantiomers of naringenin. In particular, we were able to detect the hydroxylation of (2R)-naringenin, which is not the enantiomer that is naturally formed by CHI. The results potentially lay the groundwork for future crystallization of DvCH3H, and thus understanding its ability to hydroxylate chalcones in position 3, a feature that is not inherent to the majority of F3′Hs.
Supplementary Table S2. The cloning into the appropriate vectors, the transformation into Pichia pastoris as well as the recombinant expression were all performed as described previously 40. The cultivated cells were harvested by centrifugation at 7000× g for 20 min at 4 °C, the supernatant was discarded and the cell biomass was frozen at −20 °C until further use.

Materials and methods
Sequence and phylogenetic analysis. The amino acid sequence of DvCH3H was used as a query for a BLASTP search in the NCBI database. The search was limited to Asteraceae species, the expect value was set to 1e-190 and the maximum number of target sequences was set to 250. The obtained sequences were aligned and partial sequences were omitted before the remaining 51 sequences were subjected to phylogenetic analysis by MEGA11 software 52. The amino acid sequences were aligned using the MUSCLE algorithm with default parameters, followed by construction of a phylogenetic tree based on the maximum likelihood method (Jones-Taylor-Thornton (JTT) model) with default parameters and 1000 bootstrap replicates.
The analyses of the different steps were carried out as follows: For the homogenization process, frozen and thawed cells and frozen, thawed and homogenized cells (at the same dry cell weight concentration) were diluted 1:2 to 1:1,000,000 with sterile homogenization buffer and then streaked on YPD-zeocin plates. The plates were incubated for 3 days at 30 °C before counting the CFUs. For the homogenization and centrifugation efficiency, the cell pellet and supernatant were diluted to the original volume with buffer A and then analysed by western blot.
Optimization of the membrane solubilisation. The list of the tested detergents is shown in Table 5. For membrane solubilisation, the pellets were resuspended in 1.5 times the original volume of buffer A by mixing with an Ultra-Turrax IKA T10 basic instrument (IKA, Staufen, Germany). Concentrated detergent stock solution was added to a final concentration of 10 × critical micelle concentration (CMC) for each of the detergents and the solution was rocked on a PMR-30 Compact Fixed-Angle Platform Rocker (Grant Instruments, Royston, UK) at 4 °C for 4 h unless stated otherwise. Then, the mixtures were subjected to ultracentrifugation at 200,000× g for 30 min. The solubilization efficiency was determined by diluting pellet and supernatant to the original volume with buffer A and subjecting the samples to western blots.
Product identification by LC-MS-MS. Enzymatic assays were performed as described before, except that the reactions were stopped by mixing with 10 µL acetic acid and extracted with 70 µL ethyl acetate. After drying, the reaction mixture was dissolved in 40 µL methanol. The reaction products luteolin, eriodictyol and quercetin were identified by high-performance liquid chromatography coupled to mass spectrometry (for more details see SI, Supplementary Table S3 and Supplementary Fig. S6).
Kinetic analysis. Experiments for determination of kinetic parameters of recombinant DvCH3H were performed by varying the substrate concentration. The amount of enzyme used was 1.25 ng for apigenin and kaempferol and 0.83 ng for racemic naringenin, as well as for (2S)- and (2R)-naringenin. Data analysis was carried out by nonlinear regression; mean values and standard deviations were calculated based on three replicates. Calculations and graphs were prepared with the program OriginPro 2018 (OriginLab).
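The regression in this study was performed in OriginPro. Purely as an illustration of the kind of fit involved, the following minimal sketch fits the Michaelis-Menten equation v = v_max · S / (K_M + S) by least squares using only the Python standard library. For a fixed K_M the optimal v_max has a closed form, so only K_M is searched (coarse log-scale scan followed by golden-section refinement). The search bounds, function name and the synthetic data are assumptions made for this example and are not taken from the study.

```python
import math


def fit_michaelis_menten(s, v, k_lo=1e-3, k_hi=1e3, grid=200):
    """Least-squares fit of v = v_max * s / (K_M + s); returns (K_M, v_max)."""

    def best_vmax(k):
        # For fixed K_M the model is linear in v_max: closed-form solution.
        f = [x / (k + x) for x in s]
        return sum(vi * fi for vi, fi in zip(v, f)) / sum(fi * fi for fi in f)

    def sse(log_k):
        k = math.exp(log_k)
        vm = best_vmax(k)
        return sum((vi - vm * x / (k + x)) ** 2 for x, vi in zip(s, v))

    # Coarse scan over log(K_M) to bracket the minimum.
    lo, hi = math.log(k_lo), math.log(k_hi)
    step = (hi - lo) / (grid - 1)
    logs = [lo + i * step for i in range(grid)]
    i_best = min(range(grid), key=lambda i: sse(logs[i]))
    a = logs[max(i_best - 1, 0)]
    b = logs[min(i_best + 1, grid - 1)]

    # Golden-section refinement inside the bracket.
    phi = (math.sqrt(5) - 1) / 2
    c, d = b - phi * (b - a), a + phi * (b - a)
    for _ in range(100):
        if sse(c) < sse(d):
            b, d = d, c
            c = b - phi * (b - a)
        else:
            a, c = c, d
            d = a + phi * (b - a)
    k_m = math.exp((a + b) / 2)
    return k_m, best_vmax(k_m)


# Synthetic, noise-free example data (assumed K_M = 0.8 µM, v_max = 3.0 a.u.)
# over the 0.25-100 µM substrate range used in the assays.
substrate = [0.25, 0.5, 1.0, 2.5, 5.0, 10.0, 25.0, 50.0, 100.0]
rates = [3.0 * x / (0.8 + x) for x in substrate]
km_fit, vmax_fit = fit_michaelis_menten(substrate, rates)
print(f"K_M = {km_fit:.3f} µM, v_max = {vmax_fit:.3f} a.u.")
```

With K_M values below 1 µM, only the lowest substrate concentrations lie in the non-saturated part of the curve, so the K_M estimate depends mainly on those points.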
Ethical approval. The authors confirm that the cDNA clones were obtained from a commercially available Dahlia cultivar. As such, no permissions or licences were required under institutional, national or international guidelines or regulations. The study is in full compliance with the IUCN Policy Statement on Research Involving Species at Risk of Extinction and the Convention on the Trade in Endangered Species of Wild Fauna and Flora, since the commercial varieties used are neither endangered nor at risk of extinction. The cultivar used is listed in the manuscript, including its commercial availability.

Data availability
The data generated and analyzed during the current study are supplied in this manuscript and are available from the corresponding authors upon reasonable request.